Speech Events are Recoverable from Unlabeled Articulatory Data: Using an Unsupervised Clustering Approach on Data Obtained from Electromagnetic Midsaggital Articulography (EMA)
Abstract
Some models of speech perception/production and language acquisition make use of a quasi-continuous representation of the acoustic speech signal. We investigate whether such models could profit from incorporating articulatory information in an analogous fashion. In particular, we investigate how articulatory information represented by EMA measurements can influence unsupervised phonetic speech categorization. Combining the acoustic signal with non-synthetic, raw articulatory data, we present first results of a clustering procedure of the kind applied in numerous language acquisition and speech perception models. We observe that unlabeled articulatory data, i.e. data without previously assumed landmarks, yield good clustering results. A more effective clustering outcome for plosives than for vowels seems to support the motor view of speech perception.
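The abstract does not name the clustering algorithm, so the following is only a minimal sketch of what unsupervised clustering of unlabeled frames can look like, using k-means as a stand-in. Everything here is an assumption for illustration: the function names (`farthest_first`, `kmeans`, `make_toy_frames`), the four-dimensional frames standing in for EMA coil coordinates, and the synthetic toy data, which is not taken from the paper.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def farthest_first(frames, k):
    """Deterministic initialisation: start at the first frame, then repeatedly
    pick the frame farthest from all centroids chosen so far."""
    centroids = [list(frames[0])]
    while len(centroids) < k:
        nxt = max(frames, key=lambda f: min(sq_dist(f, c) for c in centroids))
        centroids.append(list(nxt))
    return centroids


def kmeans(frames, k, iterations=20):
    """Plain Lloyd's k-means; no labels or landmarks are used."""
    centroids = farthest_first(frames, k)
    labels = [0] * len(frames)
    for _ in range(iterations):
        # Assignment step: each frame joins its nearest centroid.
        labels = [min(range(k), key=lambda c: sq_dist(f, centroids[c]))
                  for f in frames]
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [f for f, lab in zip(frames, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


def make_toy_frames(seed=1):
    """Synthetic stand-in for articulatory frames (NOT real EMA data):
    two well-separated groups of 30 four-dimensional points each."""
    rng = random.Random(seed)
    centres = [(0.0, 0.0, 0.0, 0.0), (10.0, 10.0, 10.0, 10.0)]
    return [[c + rng.gauss(0.0, 0.5) for c in centre]
            for centre in centres for _ in range(30)]


frames = make_toy_frames()
labels, centroids = kmeans(frames, k=2)
# Frames from the same synthetic group should share a cluster label.
print(labels[:30], labels[30:])
```

With real recordings, each frame would presumably concatenate acoustic features and articulatory coordinates for one time step, and the number of clusters would itself be a modelling choice.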
Related resources
Towards unsupervised articulatory resynthesis of German utterances using EMA data
As part of ongoing research towards integrating an articulatory synthesizer into a text-to-speech (TTS) framework, a corpus of German utterances recorded with electromagnetic articulography (EMA) is resynthesized to provide training data for statistical models. The resynthesis is based on a measure of similarity between the original and resynthesized EMA trajectories, weighted by articulatory r...
Co-registration of speech production datasets from electromagnetic articulography and real-time magnetic resonance imaging.
This paper describes a spatio-temporal registration approach for speech articulation data obtained from electromagnetic articulography (EMA) and real-time Magnetic Resonance Imaging (rtMRI). This is motivated by the potential for combining the complementary advantages of both types of data. The registration method is validated on EMA and rtMRI datasets obtained at different times, but using the...
Articulatory VCV Synthesis from EMA Data
This paper reports experiments in synthesizing VCV sequences with French unvoiced stop or fricative consonants, using a time-domain simulation of the vocal-tract system. The necessary dynamics of the vocal-tract shape are derived in two steps: first, time-varying parameters of an articulatory model are calculated automatically from electromagnetic articulography (EMA) data, using a method previ...
On the correlation between facial movements, tongue movements and speech acoustics
This study is a first step in a large-scale study that aims at quantifying the relationship between external facial movements, tongue movements, and the acoustics of speech sounds. The database analyzed consisted of 69 CV syllables spoken by two males and two females; each utterance was repeated four times. A Qualisys (optical motion capture system) and an EMA (electromagnetic midsagittal artic...
Critical Articulators Identification from RT-MRI of the Vocal Tract
Several technologies, such as electromagnetic midsagittal articulography (EMA) or real-time magnetic resonance (RTMRI), enable studying the static and dynamic aspects of speech production. The resulting knowledge can, in turn, inform the improvement of speech production models, e.g., for articulatory speech synthesis, by enabling the identification of which articulators and gestures are involve...